考虑$ k $过程,每个过程都会生成一系列相同和独立的随机变量。这些过程的概率度量具有必须估计的随机参数。具体而言,它们共享一个参数$ \ theta $,所有概率度量共同。此外,每个过程$ i \ in \ {1,\ dots,k \} $都有一个私有参数$ \ alpha_i $。目的是设计一种主动采样算法,以顺序估算这些参数,以形成所有样品数量最少的共享和私有参数的可靠估计。该采样算法具有三个关键组件:(i)〜数据驱动的采样决策,随着时间的推移,该决策逐渐指定应选择哪些$ k $过程进行采样; (ii)〜停止该过程的时间,该过程指定何时累积数据足以形成可靠的估计并终止采样过程; (iii)〜所有共享和私人参数的估计器。由于已知的顺序估计在分析上是棘手的,因此本文采用\ emph {条件}估计成本函数,从而导致了顺序估计方法,该方法最近被证明可以进行拖延分析。划定了渐近的最佳决策规则(采样,停止和估计),并提供了数值实验,以将所提出的程序的疗效和质量与相关方法进行比较。
translated by 谷歌翻译
本文研究了固定置信度设置中随机多臂匪徒中最佳的手臂识别(BAI)问题。考虑到指数匪徒的一般类。指数匪徒家族的最先进算法面临计算挑战。为了缓解这些挑战,提出了一个新颖的框架,该框架将BAI问题视为顺序假设测试,并且可以适合针对指数的土匪家族的可拖动分析。基于此框架,设计了BAI算法,以利用规范顺序概率比测试。该算法在两种设置中都具有三个功能:(1)其样本复杂性在渐近上是最佳的,(2)保证它是$ \ delta- $ pac,(3)它解决了最先进的计算挑战 - 艺术方法。具体而言,这些方法仅专注于高斯环境,需要从汤普森(Thompson)的手臂上进行采样,而这些方法被认为是最好的和挑战者的手臂。本文分析表明,识别挑战者在计算上是昂贵的,并且提出的算法对其进行了规定。最后,提供了数值实验来支持分析。
translated by 谷歌翻译
本文调查$ \纺织品{污染} $随机多臂爆炸中最佳臂识别问题。在此设置中,从任何臂获得的奖励由来自概率$ \ varepsilon $的对抗性模型的样本所取代。考虑了固定的置信度(无限地平线)设置,其中学习者的目标是识别最大的平均值。由于奖励的对抗污染,每个ARM的平均值仅部分可识别。本文提出了两种算法,基于连续消除的基于间隙的算法和一个,以便在亚高斯匪徒中最佳臂识别。这些算法涉及平均估计,从渐近估计的估计值达到真实均值的偏差上实现最佳误差保证。此外,这些算法渐近地实现了最佳的样本复杂性。具体地,对于基于差距的算法,样本复杂性呈渐近最佳到恒定因子,而对于基于连续的基于算法,​​它是最佳的对数因子。最后,提供了数值实验以说明与现有基线相比的算法的增益。
translated by 谷歌翻译
We consider the problem of constructing minimax rate-optimal estimators for a doubly robust nonparametric functional that has witnessed applications across the causal inference and conditional independence testing literature. Minimax rate-optimal estimators for such functionals are typically constructed through higher-order bias corrections of plug-in and one-step type estimators and, in turn, depend on estimators of nuisance functions. In this paper, we consider a parallel question of interest regarding the optimality and/or sub-optimality of plug-in and one-step bias-corrected estimators for the specific doubly robust functional of interest. Specifically, we verify that by using undersmoothing and sample splitting techniques when constructing nuisance function estimators, one can achieve minimax rates of convergence in all H\"older smoothness classes of the nuisance functions (i.e. the propensity score and outcome regression) provided that the marginal density of the covariates is sufficiently regular. Additionally, by demonstrating suitable lower bounds on these classes of estimators, we demonstrate the necessity to undersmooth the nuisance function estimators to obtain minimax optimal rates of convergence.
translated by 谷歌翻译
Search and Rescue (SAR) missions in remote environments often employ autonomous multi-robot systems that learn, plan, and execute a combination of local single-robot control actions, group primitives, and global mission-oriented coordination and collaboration. Often, SAR coordination strategies are manually designed by human experts who can remotely control the multi-robot system and enable semi-autonomous operations. However, in remote environments where connectivity is limited and human intervention is often not possible, decentralized collaboration strategies are needed for fully-autonomous operations. Nevertheless, decentralized coordination may be ineffective in adversarial environments due to sensor noise, actuation faults, or manipulation of inter-agent communication data. In this paper, we propose an algorithmic approach based on adversarial multi-agent reinforcement learning (MARL) that allows robots to efficiently coordinate their strategies in the presence of adversarial inter-agent communications. In our setup, the objective of the multi-robot team is to discover targets strategically in an obstacle-strewn geographical area by minimizing the average time needed to find the targets. It is assumed that the robots have no prior knowledge of the target locations, and they can interact with only a subset of neighboring robots at any time. Based on the centralized training with decentralized execution (CTDE) paradigm in MARL, we utilize a hierarchical meta-learning framework to learn dynamic team-coordination modalities and discover emergent team behavior under complex cooperative-competitive scenarios. The effectiveness of our approach is demonstrated on a collection of prototype grid-world environments with different specifications of benign and adversarial agents, target locations, and agent rewards.
translated by 谷歌翻译
This paper presents a novel federated reinforcement learning (Fed-RL) methodology to enhance the cyber resiliency of networked microgrids. We formulate a resilient reinforcement learning (RL) training setup which (a) generates episodic trajectories injecting adversarial actions at primary control reference signals of the grid forming (GFM) inverters and (b) trains the RL agents (or controllers) to alleviate the impact of the injected adversaries. To circumvent data-sharing issues and concerns for proprietary privacy in multi-party-owned networked grids, we bring in the aspects of federated machine learning and propose a novel Fed-RL algorithm to train the RL agents. To this end, the conventional horizontal Fed-RL approaches using decoupled independent environments fail to capture the coupled dynamics in a networked microgrid, which leads us to propose a multi-agent vertically federated variation of actor-critic algorithms, namely federated soft actor-critic (FedSAC) algorithm. We created a customized simulation setup encapsulating microgrid dynamics in the GridLAB-D/HELICS co-simulation platform compatible with the OpenAI Gym interface for training RL agents. Finally, the proposed methodology is validated with numerical examples of modified IEEE 123-bus benchmark test systems consisting of three coupled microgrids.
translated by 谷歌翻译
Dataset Distillation (DD), a newly emerging field, aims at generating much smaller and high-quality synthetic datasets from large ones. Existing DD methods based on gradient matching achieve leading performance; however, they are extremely computationally intensive as they require continuously optimizing a dataset among thousands of randomly initialized models. In this paper, we assume that training the synthetic data with diverse models leads to better generalization performance. Thus we propose two \textbf{model augmentation} techniques, ~\ie using \textbf{early-stage models} and \textbf{weight perturbation} to learn an informative synthetic set with significantly reduced training cost. Extensive experiments demonstrate that our method achieves up to 20$\times$ speedup and comparable performance on par with state-of-the-art baseline methods.
translated by 谷歌翻译
Recent years have seen rapid progress at the intersection between causality and machine learning. Motivated by scientific applications involving high-dimensional data, in particular in biomedicine, we propose a deep neural architecture for learning causal relationships between variables from a combination of empirical data and prior causal knowledge. We combine convolutional and graph neural networks within a causal risk framework to provide a flexible and scalable approach. Empirical results include linear and nonlinear simulations (where the underlying causal structures are known and can be directly compared against), as well as a real biological example where the models are applied to high-dimensional molecular data and their output compared against entirely unseen validation experiments. These results demonstrate the feasibility of using deep learning approaches to learn causal networks in large-scale problems spanning thousands of variables.
translated by 谷歌翻译
The central question in representation learning is what constitutes a good or meaningful representation. In this work we argue that if we consider data with inherent cluster structures, where clusters can be characterized through different means and covariances, those data structures should be represented in the embedding as well. While Autoencoders (AE) are widely used in practice for unsupervised representation learning, they do not fulfil the above condition on the embedding as they obtain a single representation of the data. To overcome this we propose a meta-algorithm that can be used to extend an arbitrary AE architecture to a tensorized version (TAE) that allows for learning cluster-specific embeddings while simultaneously learning the cluster assignment. For the linear setting we prove that TAE can recover the principle components of the different clusters in contrast to principle component of the entire data recovered by a standard AE. We validated this on planted models and for general, non-linear and convolutional AEs we empirically illustrate that tensorizing the AE is beneficial in clustering and de-noising tasks.
translated by 谷歌翻译
Abusive language is a concerning problem in online social media. Past research on detecting abusive language covers different platforms, languages, demographies, etc. However, models trained using these datasets do not perform well in cross-domain evaluation settings. To overcome this, a common strategy is to use a few samples from the target domain to train models to get better performance in that domain (cross-domain few-shot training). However, this might cause the models to overfit the artefacts of those samples. A compelling solution could be to guide the models toward rationales, i.e., spans of text that justify the text's label. This method has been found to improve model performance in the in-domain setting across various NLP tasks. In this paper, we propose RAFT (Rationale Adaptor for Few-shoT classification) for abusive language detection. We first build a multitask learning setup to jointly learn rationales, targets, and labels, and find a significant improvement of 6% macro F1 on the rationale detection task over training solely rationale classifiers. We introduce two rationale-integrated BERT-based architectures (the RAFT models) and evaluate our systems over five different abusive language datasets, finding that in the few-shot classification setting, RAFT-based models outperform baseline models by about 7% in macro F1 scores and perform competitively to models finetuned on other source domains. Furthermore, RAFT-based models outperform LIME/SHAP-based approaches in terms of plausibility and are close in performance in terms of faithfulness.
translated by 谷歌翻译